Abstract. In this paper we show how to express RNA tertiary interactions via the concepts
of tangled diagrams. Tangled diagrams allow us to formulate RNA base triples and pseudoknot
interactions and to control the maximum number of mutually crossing arcs. In particular we
study two subsets of tangled diagrams: 3-noncrossing tangled-diagrams with $\ell$ vertices of degree
two and 2-regular, 3-noncrossing partitions (i.e. without arcs of the form $(i, i+1)$). Our main
result is an asymptotic formula for the number of 2-regular, 3-noncrossing partitions over $[n]$,
denoted by $p_{3,2}(n)$. The asymptotic formula is derived by the analytic
theory of singular difference equations due to Birkhoff-Trjitzinsky. Explicitly, we prove the
formula $p_{3,2}(n+1) \sim K\, 8^n n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3)$, where $K$, $c_i$, $i = 1, 2, 3$ are constants.



1. Introduction 



It is well-known that the functional repertoire of RNA is closely related to the variety of its shapes. 
Therefore it is of utmost importance to understand the structural "language" of RNA as this will 
eventually allow for fast folding, identification and discovery of new RNA functionalities. Studies 
of RNA structural motifs at high resolution by NMR and X-ray crystallographic methods provided 
insight into the fundamental forces that give rise to the unique structural characteristics of RNA. 
Non-Watson-Crick purine-pyrimidine, purine-purine, and pyrimidine-pyrimidine base pairing, as
well as base-phosphate and base-ribose hydrogen bonding, are known to be important forces for
folding and stabilizing RNA structures [29]. For RNA pseudoknots (viewed as interactions between
unpaired bases) combinatorial abstractions have led to new interpretations, generating functions
and enumeration results. Although we are far from having a complete understanding of RNA pseudoknots,
conceptual progress has been made in identifying the right concepts, for instance that of
crossing-complexity, which has direct implications for novel RNA pseudoknot folding algorithms. In this
paper we build on the concepts derived in the context of RNA pseudoknots.

Before we begin by giving some background on RNA structure, let us remark why "combinatorial 
frameworks" are of central importance for any prediction algorithm. The above mentioned language 
of RNA is tantamount to uniquely specifying each element of the variety of shapes. Any prediction 
involves at some point a search through configurations and has to make sure that shapes are, for 
instance, not counted multiple times. The enumeration of the combinatorial class and analysis of 
its mathematical structure are of fundamental importance for designing such a search procedure. 
The primary sequence of an RNA molecule is its sequence of nucleotides A, G, U and C together
with the Watson-Crick (A-U, U-A, G-C, C-G) and (U-G, G-U) base pairings. Single stranded
RNA molecules form helical structures whose bonds satisfy the above base pairing rules and which,
in many cases, determine their function. Due to the biochemistry of the base pairs, stacked base
pairs, i.e. arcs of the form $(i, j)$, $(i-1, j+1)$, typically have a lower minimum free energy than
crossing arcs. Base stacking is as important in determining RNA conformations as hydrogen
bonding interactions. With the noncanonical interactions, many single-stranded loop regions such
as hairpin loops, bulge loops, and internal loops fold into well-defined secondary structures. The
prediction of RNA secondary structure is of complexity $O(n^3)$ in time and $O(n^2)$ in space for a
sequence of length $n$ [34, 35], which results from the fact that no two bonds can cross.
Figure 1. The idea behind the notion of 3-noncrossing RNA structures: (a) secondary
structure (with isolated labels 3, 7, 8, 10), (b) bi-secondary structure [18], 2, 9 being
isolated, (c) 3-noncrossing structure, which is not a bi-secondary structure. In fact, this is
the smallest 3-noncrossing RNA structure which is not a bi-secondary structure.
While the concept of secondary structure is of fundamental importance, it is well-known that there
exist additional types of nucleotide interactions [1]. These bonds are called pseudoknots [26] and
occur in functional RNA (RNAseP [21]) and ribosomal RNA [23], and are conserved in the catalytic core
of group I introns. Stadler et al. [18] suggested a class of RNA pseudoknots called bi-secondary
structures, which are essentially "superpositions" of the arcs of two "secondary structures" and
accordingly generalize from outer-planar to planar graphs, see Figure 1. Prediction algorithms for
RNA pseudoknot structures are much harder to derive since there exists no a priori recursion and
the subadditivity of local solutions is not guaranteed. The key for enumerating RNA pseudoknot
structures is their categorization in terms of the maximal size of sets of mutually crossing bonds [19],
i.e. the notion of $k$-noncrossing structures. To be precise, it is the inherent locality of the property
"$k$-noncrossing" that allows for their enumeration by lattice paths. The diagram representation of a
structure illustrates what $k$-noncrossing means, see Figure 1. In a diagram all nucleotides are drawn
horizontally and the backbone bonds are ignored, then all bonds are drawn as arcs in the upper
half-plane. The number of 3-noncrossing RNA structures satisfies
$S_3(n) \sim \frac{10.4724 \cdot 4!}{n(n-1)\cdots(n-4)} \left(\frac{5+\sqrt{21}}{2}\right)^n$
[21]; however, it is not the exponential growth rate of $\frac{5+\sqrt{21}}{2}$ but the inherent non-recursiveness
which makes the prediction difficult.




Figure 2. HIV-2 TAR [3]. In HIV-2 TAR we have a (C38-G27)·C23+ triple mutant.
Improved NMR spectral properties of HIV-2 TAR allowed the observation of the C23
amino and imino protons, providing direct evidence of hydrogen bonding interaction.
The tertiary interaction is a tangled-diagram with one vertex of degree two.



A first step towards RNA-tertiary structures beyond pseudoknot interactions consists in consid- 
ering single strands interacting with helical regions by forming tertiary contacts with base-paired 
nucleotides of the helices. Nucleotide triples occur when single-stranded nucleotides form hydrogen
bonds with nucleotides that are already base paired. These hydrogen bonds can involve bases,
sugars and phosphates. These interactions function to orient regions of secondary structures in
large RNA molecules and to stabilize RNA three-dimensional structures. Base triples are a special
case of nucleotide triple interactions in which base-base hydrogen bonding occurs. Single-stranded
nucleotides can interact with base paired nucleotides via either the major groove or the minor
groove of duplex regions. Nucleotide triples have been shown or proposed to form at junctions of
coaxially stacked RNA helices that have adjacent single-stranded regions [29, 10]. Several major
groove triples are present in tRNA where they function to stabilize its L-shaped three-dimensional
structure. These interactions require us to consider tangled diagrams [8], i.e. diagrams with vertices
of degree $\le 2$ which exhibit a variety of arc configurations, see Section 2. This variety is motivated
by nucleotide interactions observed in RNA structures. In Figure 2 we show the HIV-2 TAR
(C38-G27)·C23+ triple mutant structure as a tangled-diagram. Let us next have a closer look
at the hammerhead structure-motif [10] in Figure 3. Comparing Figure 2 with Figure 3 reveals
one feature of the hammerhead motif. It exhibits a lefthand-endpoint of degree 2 (incident to the
dashed arc) while all other vertices of degree 2 are left- and righthand-endpoints. These two examples
indicate that the majority of the bonds is organized in helical regions, where Watson-Crick
and G-U (U-G) base pairs are stacked; additional stacks can be realized by forming pseudoknots.




Figure 3. Diagram representation of the hammerhead ribozyme [10], which can be
represented as a tangled-diagram with two vertices of degree two. The gap after C25
indicates that some nucleotides are omitted, which are involved in an unrelated structural
motif.
Finally, in Figure 4 we display the catalytic core region of the group I self-splicing intron [9]. In
order to express tertiary interactions we consider tangled-diagrams introduced in [8], which capture
the nucleotide interactions relevant for the tertiary structure of the molecule [1].




Figure 4. Catalytic core region of the group I self-splicing intron [9] corresponds to 
a tangled-diagram with six vertices of degree two. The gaps after G54, U72, G103 and 
A112 indicate that some nucleotides are omitted which are involved in an unrelated 
structural motif. 



We will discuss two combinatorial frameworks arising from tangled-diagrams [8], both being suited
for expressing RNA tertiary interactions. The first is the set of tangled-diagrams with a fixed number
of vertices of degree 2 and the second the set of 2-regular $k$-noncrossing partitions. While the
former can easily be enumerated, the latter requires more work. 2-regular $k$-noncrossing partitions
evade lattice path enumeration due to their inherent asymmetry (lacking arcs of length 1). The
"straightforward" ansatz via inclusion-exclusion applied to the set of all $k$-noncrossing partitions
revealed a connection between seemingly unrelated combinatorial objects: partitions and enhanced
partitions, enumerated by Bousquet-Mélou and Xin [4]. In Lemma 1 [20] we show how this
relation can be used to obtain the enumeration. Subsequently, we prove the following simple
asymptotic formula for the numbers of 2-regular, 3-noncrossing partitions



(1.1) $p_{3,2}(n+1) \sim K\, 8^n n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3)$

where $K = 6686.408973$, $c_1 = -28$, $c_2 = 455.77778$ and $c_3 = -5651.160494$. As for the quality of
approximation we present the sub-exponential factors in the table below, where the sub-exponential
factor of eq. (1.1) is $K n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3)$.

[Table: The Sub-exponential Factor, column $p_3(n)/8^n$]
Our analysis is based on the theory of Birkhoff-Trjitzinsky, which seems to be somewhat overlooked.
While the two original papers [5, 6] are hard to read, the paper of Wimp and Zeilberger [33] provides
a good introduction and shows via various examples how to apply the theory. Since the method
(if it applies) is quite powerful we give an overview of the analytic theory of singular difference
equations in the Appendix.



2. Vacillating tableaux and tangled-diagrams 

2.1. Tangled-diagrams. A tangled-diagram over $[n]$ is a triple of sets $(V, E, F)$, where $V$ is a
finite non-empty set of $n$ elements called vertices, $E$ is a set of unordered pairs of vertices called
arcs and $F$ is the flag set, whose elements are the vertices of degree two that are the ends of
two crossing arcs. A tangled-diagram is represented by drawing its vertices in a horizontal line and
its arcs $(i, j)$ in the upper halfplane with the following basic configurations and the isolated points

[Figure: the basic arc configurations]

Composing these motifs we obtain a tangled-diagram, for instance, the tangled-diagram

[Figure: a tangled-diagram over the vertices 1, 2, ..., 23]

has $V = [23]$ and $F = \{1, 18, 23\}$. Let us introduce several important subclasses of 3-noncrossing
tangled-diagrams: 

(1) 3-noncrossing matchings with isolated points are 3-noncrossing tangled-diagrams in which each
vertex has degree at most 1. For instance, RNA pseudoknot structures are 3-noncrossing
matchings with isolated points, see Figure [3

Figure 5. We denote the backbone by the blue line and bonds by black lines.

(2) 2-regular, 3-noncrossing partitions. A partition corresponds to a tangled-diagram in which
any vertex of degree two, $j$, is incident to the arcs $(i, j)$ and $(j, s)$, where $i < j < s$, for
instance, see Figure [H and Figure [SI (a). Partitions without arcs of the form $(i, i+1)$ are
called 2-regular partitions.

(3) 3-noncrossing braids without isolated points are tangled-diagrams in which all vertices $j$ of
degree two are either incident to loops $(j, j)$ or to crossing arcs $(i, j)$ and $(j, h)$, where
$i < j < h$, see Figure [H (b).

(4) 3-noncrossing diagrams with $\ell$ vertices of degree 2. Figure 2, Figure 3 and Figure 4 are
3-noncrossing tangled-diagrams with $\ell = 1, 2, 6$ vertices of degree 2. The following tangled-diagram
shows all 4 basic types of degree 2 vertices in tangled diagrams.






JING QIN AND CHRISTIAN M. REIDYS * 



Figure 6. Panels (a) and (b).
In the following, we study the subclasses (2) and (4) since they represent a natural framework for
RNA tertiary interactions. It turns out that (3) is of importance since it facilitates the enumeration
of (2). To be precise, it is shown in [20] that there is a duality between $k$-noncrossing braids without
isolated points and 2-regular $k$-noncrossing partitions.

Having introduced the combinatorial framework, one key question is how to enumerate the sub- 
classes (2) and (4). The enumeration is facilitated via a bijection between the tangled-diagrams 
and certain lattice paths. To derive the latter a bijection between tangled-diagrams and (general- 
ized) vacillating tableaux is constructed. It is then easy to see that vacillating tableaux correspond 
to lattice paths. In the next Section we provide some background on vacillating tableaux and the 
bijection. 

2.2. Vacillating tableaux. A Young diagram (shape) is a collection of squares arranged in
left-justified rows with weakly decreasing number of boxes in each row. A Young tableau is a filling
of the squares by numbers which is weakly increasing in each row and strictly increasing in each
column. A tableau is called standard if each entry occurs exactly once. A tableau-sequence is
a sequence $\varnothing = \mu^0, \mu^1, \ldots, \mu^n = \varnothing$ of standard Young diagrams, such that for $1 \le i \le n$, $\mu^i$ is
obtained from $\mu^{i-1}$ by either adding one square, removing one square or doing nothing.
The RSK-algorithm is a process of row-inserting elements into a tableau. Suppose we want to
insert $k$ into a standard Young tableau $\lambda$. Let $\lambda_{i,j}$ denote the element in the $i$-th row and $j$-th
column of the Young tableau. Let $i$ be the largest integer such that $\lambda_{1,i-1} \le k$. (If $\lambda_{1,1} > k$, then
$i = 1$.) If $\lambda_{1,i}$ does not exist, then simply add $k$ at the end of the first row. Otherwise, if $\lambda_{1,i}$
exists, then replace $\lambda_{1,i}$ by $k$. Next insert $\lambda_{1,i}$ into the second row following the above procedure
and continue until an element is inserted at the end of a row. As a result we obtain a new standard
Young tableau with $k$ included. For instance, inserting the number sequence 5, 2, 4, 1, 6, 3 starting
with an empty shape yields the following sequence of standard Young tableaux:



[Figure: the sequence of standard Young tableaux obtained by successively inserting 5, 2, 4, 1, 6, 3]
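The row-insertion procedure can be sketched in a few lines of Python. This is a minimal illustration under our own conventions, not code from the cited literature: a tableau is assumed to be stored as a list of increasing rows, and the helper names `row_insert` and `rsk_sequence` are hypothetical.

```python
from bisect import bisect_right


def row_insert(tableau, k):
    """Row-insert k into a tableau given as a list of increasing rows."""
    rows = [list(r) for r in tableau]      # work on a copy
    for row in rows:
        pos = bisect_right(row, k)         # index of the first entry larger than k
        if pos == len(row):                # no larger entry: k goes to the end of the row
            row.append(k)
            return rows
        row[pos], k = k, row[pos]          # bump the larger entry into the next row
    rows.append([k])                       # the bumped entry opens a new row
    return rows


def rsk_sequence(word):
    """Return the list of tableaux obtained by inserting the entries of word one by one."""
    tableau, history = [], []
    for k in word:
        tableau = row_insert(tableau, k)
        history.append(tableau)
    return history


history = rsk_sequence([5, 2, 4, 1, 6, 3])
for t in history:
    print(t)
```

For the sequence 5, 2, 4, 1, 6, 3 the last tableau printed has the rows 1 3 6, 2 4 and 5.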
A vacillating tableau [5] $V_\lambda^{2n}$ of shape $\lambda$ and length $2n$ is a sequence $(\lambda^0, \lambda^1, \ldots, \lambda^{2n})$ of shapes
such that (i) $\lambda^0 = \varnothing$ and $\lambda^{2n} = \lambda$, and (ii) $(\lambda^{2i-1}, \lambda^{2i})$ is derived from $\lambda^{2i-2}$, for $1 \le i \le n$, by either
$(\varnothing, \varnothing)$: doing nothing twice; $(-\square, \varnothing)$: first removing a square then doing nothing; $(\varnothing, +\square)$: first
doing nothing then adding a square; $(\pm\square, \pm\square)$: adding/removing a square at the odd and even
steps, respectively. Let $\mathcal{V}_\lambda^{2n}$ denote the set of vacillating tableaux. For instance, let us consider the
following vacillating tableau:

[Figure: example of a vacillating tableau]



2.3. A bijection between vacillating tableaux and tangled-diagrams. When constructing
the bijection between vacillating tableaux and tangled-diagrams in Theorem 1 below, the notion of
the inflation of a tangled-diagram is important. We are now able to discuss the bijection between
vacillating tableaux and tangled diagrams.

Theorem 1. [5] There exists a bijection between the set of vacillating tableaux of shape $\varnothing$ and
length $2n$, $\mathcal{V}_\varnothing^{2n}$, and the set of tangled-diagrams over $n$ vertices, $\mathcal{G}_n$,

(2.1) $\beta\colon \mathcal{V}_\varnothing^{2n} \longrightarrow \mathcal{G}_n$.
Figure 7. The inflation map: each vertex $i$ of degree 2 is replaced by a pair of vertices,
each incident to a respective arc.

Figure 8. From tangled-diagrams to lattice paths. First the tangled-diagram (upper
left) is resolved into its vacillating tableaux (upper right). Reading the numbers of
squares in the corresponding rows (bottom right) induces the $2n$-step lattice path (bottom
right), which starts and ends in $(1, 0)$. The path has $\varnothing$-steps (green points) and $+\square$ and $-\square$ steps
(red and purple points) induced by the pair steps $(\varnothing, +\square)$, $(-\square, \varnothing)$ and $(\pm\square, \pm\square)$. Note
that the lattice path does not touch the "wall" $x = y$.

Furthermore, a tangled-diagram $G_n$ is $k$-noncrossing if and only if all shapes $\lambda^i$ in its vacillating
tableaux have less than $k$ rows. That is, $\beta\colon \mathcal{V}_\varnothing^{2n} \longrightarrow \mathcal{G}_n$ maps vacillating tableaux having less than
$k$ rows into $k$-noncrossing tangled-diagrams.



The proof of Theorem 1 relies on the idea to resolve the vertices of degree 2 via an inflation,
i.e. vertex $i$ is resolved by the pair $(i, i')$, where we utilize the linear order
$1 < 1' < 2 < 2' < \cdots < (n-1) < (n-1)' < n < n'$. The inflation transforms each tangled-diagram into a partial matching
with isolated points. Restricting the steps for vacillating tableaux produces the
bijections of Chen et al. [7]. Let $\mathcal{M}_k(n)$, $\mathcal{P}_k(n)$ and $\mathcal{B}_k(n)$ denote the set of $k$-noncrossing matchings
[32], partitions and braids without isolated points over $[n]$, respectively. Theorem 1 basically says
the tableaux-sequences of $\mathcal{M}_k(n)$, $\mathcal{P}_k(n)$ and $\mathcal{B}_k(n)$ are composed by the elements in $S_{\mathcal{M}_k}$, $S_{\mathcal{P}_k}$ and
$S_{\mathcal{B}_k}$ respectively, where

$S_{\mathcal{M}_k} = \{(-\square_h, \varnothing), (\varnothing, +\square_h)\}$

$S_{\mathcal{P}_k} = \{(-\square_h, \varnothing), (\varnothing, +\square_h), (\varnothing, \varnothing), (-\square_h, +\square_l)\}$

$S_{\mathcal{B}_k} = \{(-\square_h, \varnothing), (\varnothing, +\square_h), (+\square_h, -\square_l)\}, \quad 1 \le h, l \le k-1$

and $\pm\square_h$ denotes the adding or subtracting of the rightmost square "$\square_h$" in the $h$-th row in a given
shape $\lambda$ and "$\varnothing$" denotes doing nothing. To get some intuition about the particular steps and
diagram-configurations let us show the key correspondences between tableaux and diagram-motifs

[Figure: correspondences between tableau steps and diagram-motifs]



3. $k$-NONCROSSING TANGLED DIAGRAMS AND 2-REGULAR, $k$-NONCROSSING PARTITIONS



In this section we prove two enumeration results. We give explicit formulas for $k$-noncrossing
tangled diagrams with a fixed number of degree 2 vertices and 2-regular $k$-noncrossing partitions.
Since the latter formula is quite complicated we provide a simple asymptotic expression in Section 4.

Let $f_k(n)$ denote the number of $k$-noncrossing perfect matchings over $[n]$ and let $C_m$ be the $m$-th Catalan number. Our
first result reads

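Theorem 2. The number of the $k$-noncrossing tangled-diagrams over $[n]$ with $\ell$ vertices of degree
two, denoted by $d_{\ell,k}(n)$, is given by

$$d_{\ell,k}(n) = \sum_{i=0}^{n-\ell} \binom{n}{i}\binom{n-i}{\ell}\, f_k(n-i+\ell)$$

and in particular for $k = 3$ we have

$$d_{\ell,3}(n) = \sum_{\substack{0 \le i \le n-\ell \\ n-i+\ell \text{ even}}} \binom{n}{i}\binom{n-i}{\ell} \left[ C_{\frac{n-i+\ell}{2}}\, C_{\frac{n-i+\ell}{2}+2} - C_{\frac{n-i+\ell}{2}+1}^{2} \right].$$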

Proof. Let $D_{i,\ell,k}$ denote the set of $k$-noncrossing tangled-diagrams over $[n]$ with $i$ isolated points and $\ell$ vertices
of degree two and $d_{i,\ell,k} = |D_{i,\ell,k}|$. There are $\binom{n}{i}\binom{n-i}{\ell}$ ways to choose the locations of the isolated
points and the vertices of degree two. Furthermore, for an arbitrary tangled-diagram over $V = [n]$
with $i$ isolated points $V_1 = \{v_1, v_2, \ldots, v_i\} \subset V$ and $\ell$ vertices of degree two $V_2 = \{v_{i+1}, \ldots, v_{i+\ell}\} \subset V$,
let $V' = V \setminus (V_1 \cup V_2) = \{v_{i+\ell+1}, \ldots, v_n\}$ be the set of vertices of degree one. Via the inflation
we obtain a perfect matching over $[\,|V_2 \cup V_2' \cup V'|\,] = [2\ell + n - i - \ell] = [n - i + \ell]$, where
$V_2' = \{v_{i+1}', \ldots, v_{i+\ell}'\}$. Since $d_{\ell,k}(n) = \sum_{i=0}^{n-\ell} d_{i,\ell,k}$, the theorem follows. □



The first 10 numbers $d_{\ell,3}(n)$ for $\ell = 1, 2, 3$ and $n = 1, \ldots, 10$ are given by

$\ell \backslash n$    1    2    3    4     5      6       7        8         9         10
1                               12   40   165    606    2380     9136     36099    142750
2                          3     9  102   450   2565   11823    57876    266220   1243170
3                               14   56   980   5320   38920   214144   1251852   6672120
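The counting argument in the proof of Theorem 2 is easy to check numerically. The following Python sketch is an illustration under stated assumptions: it sums $\binom{n}{i}\binom{n-i}{\ell}$ times the number of 3-noncrossing perfect matchings over $[n-i+\ell]$, and it assumes the known closed form $f_3(2m) = C_m C_{m+2} - C_{m+1}^2$ for the latter. The function names `catalan`, `f3` and `d3` are our own.

```python
from math import comb


def catalan(m):
    """m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


def f3(m):
    """3-noncrossing perfect matchings over [m] (assumed closed form; 0 for odd m)."""
    if m % 2:
        return 0
    h = m // 2
    return catalan(h) * catalan(h + 2) - catalan(h + 1) ** 2


def d3(ell, n):
    """3-noncrossing tangled-diagrams over [n] with ell vertices of degree two."""
    # i runs over the number of isolated points; inflation leaves n - i + ell endpoints
    return sum(comb(n, i) * comb(n - i, ell) * f3(n - i + ell)
               for i in range(n - ell + 1))


for ell in (1, 2, 3):
    print(ell, [d3(ell, n) for n in range(1, 11)])
```

Under these assumptions the sketch reproduces the tabulated values, for instance `d3(1, 3) == 12` and `d3(3, 10) == 6672120`.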



We proceed by enumerating 2-regular $k$-noncrossing partitions. A valid approach for this consists
in building on the enumeration results of [1] for $k$-noncrossing partitions using the
inclusion-exclusion principle. This strategy leads to functional equations which prove that the asymptotic
formulas of 2-regular $k$-noncrossing partitions and braids without isolated points coincide. But
braids can be enumerated via kernel methods [22, 11, 13] directly, while 2-regular $k$-noncrossing
partitions cannot. This suggests an alternative ansatz [20], by directly establishing a relation
between partitions and braids and consequently enumerating partitions via braids. In Lemma 1
below we show this correspondence. To this end we replace in a braid without isolated points each
loop by an isolated vertex and each pair of crossing arcs at a degree 2 vertex by noncrossing arcs,
i.e.

[Figure: replacement of loops by isolated vertices and of crossing arcs at a degree 2 vertex by noncrossing arcs]



A COMBINATORIAL FRAMEWORK FOR RNA TERTIARY INTERACTION 







Accordingly, we can identify braids without isolated points with a subset of 3-noncrossing partitions.

Lemma 1. [20] Let $k \in \mathbb{N}$, $k \ge 3$. Then we have the bijection

$$\vartheta\colon \mathcal{P}_{k,2}(n) \longrightarrow \mathcal{B}_k(n-1),$$

where $\vartheta$ has the following property: for any $\pi \in \mathcal{P}_{k,2}(n)$ holds: $(i, j)$ is an arc of $\pi$ if and only if
$(i, j-1)$ is an arc in $\vartheta(\pi)$.

Proof. By construction, $\vartheta$ maps tangled-diagrams over $[n]$ into tangled diagrams over $[n-1]$. Since
there exist no arcs of the form $(i, i+1)$, $\vartheta(\pi)$ is, for any $\pi \in \mathcal{P}_{k,2}(n)$, loop-free. By construction, $\vartheta$
preserves the orientation of arcs, whence $\vartheta(\pi)$ is a partition.
Claim 1. $\vartheta\colon \mathcal{P}_{k,2}(n) \longrightarrow \mathcal{B}_k(n-1)$ is well-defined.

We first prove that $\vartheta(\pi)$ is $k$-noncrossing. Suppose there exist $k$ mutually crossing arcs $(i_s, j_s)$,
$s = 1, \ldots, k$ in $\vartheta(\pi)$. Since $\vartheta(\pi)$ is a partition we have $i_1 < \cdots < i_k < j_1 < \cdots < j_k$. Accordingly,
we obtain for the partition $\pi \in \mathcal{P}_{k,2}(n)$ the $k$ arcs $(i_s, j_s + 1)$, $s = 1, \ldots, k$, where
$i_1 < \cdots < i_k < j_1 + 1 < \cdots < j_k + 1$, which is impossible since $\pi$ is $k$-noncrossing. We next show that $\vartheta(\pi)$ is a
$k$-noncrossing braid. If $\vartheta(\pi)$ is not a $k$-noncrossing braid, then according to eq. (3.1) $\vartheta(\pi)$ contains
$k$ arcs of the form $(i_1, j_1), \ldots, (i_k, j_k)$ such that $i_1 < \cdots < i_k = j_1 < \cdots < j_k$ holds. Then $\pi$ contains
the arcs $(i_1, j_1 + 1), \ldots, (i_k, j_k + 1)$ where $i_1 < \cdots < i_k < j_1 + 1 < \cdots < j_k + 1$, which is impossible
since these arcs are a set of $k$ mutually crossing arcs, and the claim follows.
Claim 2. $\vartheta$ is bijective.

Clearly $\vartheta$ is injective and it remains to prove surjectivity. For any $k$-noncrossing braid $\delta$ there
exists some 2-regular partition $\pi$ such that $\vartheta(\pi) = \delta$. We have to show that $\pi$ is $k$-noncrossing. Let
$M' = \{(i_1, j_1), \ldots, (i_k, j_k)\}$ be a set of $k$ mutually crossing arcs of $\pi$, i.e. $i_1 < \cdots < i_k < j_1 < \cdots < j_k$.
Then we have in $\vartheta(\pi)$ the arcs $(i_s, j_s - 1)$, $s = 1, \ldots, k$ and $i_1 < \cdots < i_k \le j_1 - 1 < \cdots < j_k - 1$.
If $M = \{(i_1, j_1 - 1), \ldots, (i_k, j_k - 1)\}$ is $k$-noncrossing then we conclude $i_k = j_1 - 1$. Therefore
$M = \{(i_1, j_1 - 1), \ldots, (i_k, j_k - 1)\}$, where $i_k = j_1 - 1$, which is, in view of eq. (3.1), impossible in
$k$-noncrossing braids. By transposition we have thus proved that any $\vartheta$-preimage is necessarily a



(3.1) 
Figure 9. The bijection $\vartheta\colon \mathcal{P}_{k,2}(n) \longrightarrow \mathcal{B}_k(n-1)$. Crossings are reduced by contracting
the arcs.

$k$-noncrossing partition, whence the claim and the proof of the lemma is complete.

□ 
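For small $n$ the objects of Lemma 1 can be listed by brute force. The Python sketch below is only an illustration: partitions are assumed to be encoded by their arcs (pairs of consecutive elements of a block), a 3-crossing is a triple of arcs with $i_1 < i_2 < i_3 < j_1 < j_2 < j_3$, and the map `theta` implements the arc rule $(i, j) \mapsto (i, j-1)$. All function names are our own.

```python
from itertools import combinations


def set_partitions(n):
    """Yield all set partitions of {1, ..., n} as lists of increasing blocks."""
    def rec(i, blocks):
        if i > n:
            yield [list(b) for b in blocks]
            return
        for b in blocks:                 # put i into an existing block
            b.append(i)
            yield from rec(i + 1, blocks)
            b.pop()
        blocks.append([i])               # or open a new block with i
        yield from rec(i + 1, blocks)
        blocks.pop()
    yield from rec(1, [])


def arcs(partition):
    """Arcs of a partition: consecutive elements within each block, sorted by left endpoint."""
    return sorted((b[t], b[t + 1]) for b in partition for t in range(len(b) - 1))


def is_2_regular(a):
    """No arc of the form (i, i+1)."""
    return all(j - i > 1 for i, j in a)


def has_3_crossing(a):
    """Three mutually crossing arcs i1 < i2 < i3 < j1 < j2 < j3."""
    return any(x[0] < y[0] < z[0] < x[1] < y[1] < z[1]
               for x, y, z in combinations(a, 3))


def count_p32(n):
    """Number of 2-regular, 3-noncrossing partitions over [n]."""
    return sum(1 for p in set_partitions(n)
               if is_2_regular(arcs(p)) and not has_3_crossing(arcs(p)))


def theta(a):
    """Arc rule of Lemma 1: (i, j) is mapped to (i, j - 1)."""
    return [(i, j - 1) for i, j in a]


print([count_p32(n) for n in range(1, 7)])
print(theta([(1, 3), (2, 4)]))   # two crossing arcs become arcs meeting at vertex 2
```

The two crossing arcs $(1,3)$, $(2,4)$ are sent to $(1,2)$, $(2,3)$, i.e. to a vertex of degree two, which is how crossings of the partition turn into the crossing configurations of a braid.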

The bijection of Lemma 1 is illustrated in Figure 9. Via Lemma 1 we have reduced the
enumeration of 2-regular $k$-noncrossing partitions to that of braids without isolated points. Let
us discuss how the latter can be enumerated via lattice paths. From Theorem 1 (see Figure 8)
we know that a 3-noncrossing braid corresponds to a lattice path in the first quadrant with the
following properties:

(1) the path starts and ends at $(1, 0)$,

(2) each step pair $(2i-1, 2i)$, where $1 \le i \le n$, is an element of

$\{(\varnothing, +e_1), (\varnothing, +e_2), (-e_1, \varnothing), (-e_2, \varnothing), (+e_1, -e_1), (+e_2, -e_2), (+e_1, -e_2), (+e_2, -e_1)\}$,

(3) the path never touches the wall $x = y$.

The key result facilitating the enumeration is the reflection principle due to D. André (1887) [2],
subsequently generalized by Gessel and Zeilberger [M]. It is worth mentioning that this strategy
is nonconstructive since the enumeration is obtained by counting all paths and having paths touching
the wall cancel each other.

Theorem 3. (Reflection-Principle) [M] Suppose $S \in \{S_{\mathcal{M}_3}, S_{\mathcal{P}_3}, S_{\mathcal{B}_3}\}$ and let $\omega_{(1,0)}^{(1,0)}(2n)$ denote the
number of $S$-walks of length $2n$ from $(1, 0)$ to $(1, 0)$ that remain in the region
$R = \{(x, y) \mid x > y \ge 0,\ (x, y) \in \mathbb{Z}^2\}$. Let furthermore $f_{(x,y)}^{(x',y')}(2n)$ be the number of $S$-walks from $(x, y)$ to $(x', y')$ of
length $2n$ that remain in the first quadrant. Then we have

(3.2) $\omega_{(1,0)}^{(1,0)}(2n) = f_{(1,0)}^{(1,0)}(2n) - f_{(1,0)}^{(0,1)}(2n)$.

Proof. Suppose $\gamma$ is an $S$-walk starting and ending at $(1, 0)$ which remains in the first quadrant and
touches the diagonal $x = y$. Let $(a, a)$ be the first point where $\gamma$ touches the diagonal $y = x$.
Reflect all steps of $\gamma$ after $\gamma$ touched the diagonal in $(a, a)$ and denote the resulting walk by $\gamma'$.
Then $\gamma'$ is an $S$-walk starting from $(1, 0)$ and ending at $(0, 1)$. This procedure yields a unique pair
$(\gamma, \gamma')$ for each $S$-walk $\gamma$ starting and ending at $(1, 0)$ which remains in the first quadrant and that
[Figure legend: original path, reflected path, stay step in the original path, stay step in the reflected path, reflection point]

Figure 10. The reflection principle. The original lattice path (blue) starting and
ending at $(1, 0)$ touches the wall $x = y$ at $(3, 3)$ for the first time. The corresponding
reflected path (red) starts at $(1, 0)$ and ends at $(0, 1)$; it is obtained by reflecting all
steps after $(3, 3)$ w.r.t. the wall $x = y$.

touches the diagonal $x = y$. According to eq. (3.2) these pairs cancel themselves and only the
paths that never touch the diagonal remain, whence the theorem. □
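The reflection principle can be tested numerically for the braid step set. The Python sketch below is a hedged illustration, not the method of [4]: it counts walks by dynamic programming over pair steps, checking the region both after the first and after the second step of each pair, and compares the direct count in the region $x > y \ge 0$ with the difference of first-quadrant counts in eq. (3.2). The encoding of steps as integer vectors and all names are our own assumptions.

```python
from collections import defaultdict

E1, E2, STAY = (1, 0), (0, 1), (0, 0)


def neg(v):
    return (-v[0], -v[1])


# pair steps of 3-noncrossing braids without isolated points
PAIR_STEPS = [(STAY, E1), (STAY, E2), (neg(E1), STAY), (neg(E2), STAY),
              (E1, neg(E1)), (E2, neg(E2)), (E1, neg(E2)), (E2, neg(E1))]


def count_walks(n, start, end, allowed):
    """Number of walks of n pair steps from start to end with every visited point allowed."""
    counts = {start: 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for (x, y), c in counts.items():
            for (a, b), (u, v) in PAIR_STEPS:
                mid = (x + a, y + b)
                fin = (mid[0] + u, mid[1] + v)
                if allowed(mid) and allowed(fin):
                    nxt[fin] += c
        counts = nxt
    return counts.get(end, 0)


def in_wedge(p):
    """Region x > y >= 0: the path never touches the wall x = y."""
    return p[0] > p[1] >= 0


def in_quadrant(p):
    return p[0] >= 0 and p[1] >= 0


def wedge_count(n):
    return count_walks(n, (1, 0), (1, 0), in_wedge)


def reflected_count(n):
    return (count_walks(n, (1, 0), (1, 0), in_quadrant)
            - count_walks(n, (1, 0), (0, 1), in_quadrant))


for n in range(1, 7):
    print(n, wedge_count(n), reflected_count(n))
```

Both columns agree for each $n$; the identity relies on the step set being symmetric under exchanging $e_1$ and $e_2$.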



Using the reflection principle we can enumerate braids via the kernel method [22, 11, 13]. In fact,
these computations have been carried out in [4], where enhanced partitions are enumerated. Our second
result reads

Theorem 4. The number of 2-regular, 3-noncrossing partitions is given by

$$\begin{aligned}
p_{3,2}(n+1) = \sum_{s} \bigl[\, &\beta_n(1, 0, s) - \beta_n(1, -1, s) - \beta_n(1, -4, s) + \beta_n(1, -3, s) \\
 &- \beta_n(3, 4, s) + \beta_n(3, 3, s) + \beta_n(3, 0, s) - \beta_n(3, 1, s) \\
 &+ \beta_n(2, 5, s) - \beta_n(2, 4, s) - \beta_n(2, 1, s) + \beta_n(2, 2, s) \,\bigr].
\end{aligned}$$

Furthermore $p_{3,2}(n)$ satisfies the recursion

(3.3) $a_1(n)\, p_{3,2}(n+1) + a_2(n)\, p_{3,2}(n+2) + a_3(n)\, p_{3,2}(n+3) - a_4(n)\, p_{3,2}(n+4) = 0$,

where

$a_1(n) = 8(n+2)(n+3)(n+1)$

$a_2(n) = 3(n+2)(5n^2 + 47n + 104)$

$a_3(n) = 3(n+4)(2n+11)(n+7)$

$a_4(n) = (n+9)(n+8)(n+7)$.

For instance, the first 12 numbers of 2-regular, 3-noncrossing partitions are given by

$n$             1   2   3   4   5    6    7     8     9      10      11      12
$p_{3,2}(n)$    1   1   2   5   15   51   191   772   3320   15032   71084   348889
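Recursion (3.3) gives a quick way to generate these numbers. The Python sketch below is an illustration: it assumes the initial values $p_{3,2}(1) = 1$, $p_{3,2}(2) = 1$, $p_{3,2}(3) = 2$ read off the table, uses exact integer arithmetic, and checks that each division by the leading coefficient is exact. The function names are our own.

```python
def a1(n):
    return 8 * (n + 2) * (n + 3) * (n + 1)


def a2(n):
    return 3 * (n + 2) * (5 * n * n + 47 * n + 104)


def a3(n):
    return 3 * (n + 4) * (2 * n + 11) * (n + 7)


def a4(n):
    return (n + 9) * (n + 8) * (n + 7)


def p32_table(count):
    """First `count` values p_{3,2}(1), ..., p_{3,2}(count) via recursion (3.3)."""
    p = {1: 1, 2: 1, 3: 2}          # initial values taken from the table
    n = 0
    while len(p) < count:
        numerator = a1(n) * p[n + 1] + a2(n) * p[n + 2] + a3(n) * p[n + 3]
        value, remainder = divmod(numerator, a4(n))
        assert remainder == 0, "recursion should produce integers"
        p[n + 4] = value
        n += 1
    return [p[m] for m in range(1, count + 1)]


print(p32_table(12))
```

Starting from the three initial values, the recursion reproduces the twelve tabulated numbers.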



We will show in the next section that the formula given in Theorem 4 has a simple asymptotic
form.

4. Asymptotic analysis 

In this section we employ the particularly elegant theory of singular difference equations due to
Birkhoff and Trjitzinsky [6]. The theory of Birkhoff-Trjitzinsky establishes form, existence and
properties of fundamental sets of solutions in general, and is discussed in the Appendix. For our
purposes it suffices to identify the unique, monotonically increasing formal series solution (FSS).

Theorem 5. There exist real constants $K > 0$ and $c_1, c_2, c_3$ such that

(4.1) $p_{3,2}(n+1) \sim K\, 8^n n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3)$

holds. Explicitly, we have $K = 6686.408973$, $c_1 = -28$, $c_2 = 455.77778$ and $c_3 = -5651.160494$.

Proof. Claim. There exist some $K > 0$ and $c_1, c_2, c_3, \ldots$ such that

(4.2) $p_{3,2}(n+1) \sim K\, 8^n n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3 + \cdots)$.

Theorem 6 guarantees the existence of 3 linearly independent formal series solutions (FSS) for
eq. (3.3). We proceed by constructing these using the following ansatz for $p_{3,2}(n)$:

(4.3) $p_{3,2}(n+1) = E(n) K(n), \qquad E(n) = e^{\mu_0 n \ln n + \mu_1 n}\, n^{\theta}$






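where

(4.4) $K(n) = \exp\{\alpha_1 n^{\beta} + \alpha_2 n^{\beta - 1/\rho} + \cdots\}, \qquad \alpha_1 \neq 0,\ \beta = j/\rho,\ 0 \le j < \rho.$

We immediately derive, setting $\lambda = e^{\mu_0 + \mu_1}$,

$$\frac{p_{3,2}(n+k+1)}{p_{3,2}(n+1)} = n^{\mu_0 k} \lambda^k \Bigl\{1 + \frac{k\theta + k^2\mu_0/2}{n} + \cdots\Bigr\}
\exp\{\alpha_1 \beta k\, n^{\beta-1} + \alpha_2 (\beta - 1/\rho) k\, n^{\beta - 1/\rho - 1} + \cdots\}.$$

Substituting into eq. (3.3) and dividing by $a_1(n)$ we arrive at

$$\begin{aligned}
0 = 1 &+ \frac{15}{8}\Bigl(1 + \frac{27}{5n} + \cdots\Bigr) n^{\mu_0}\lambda \Bigl\{1 + \frac{\theta + \mu_0/2}{n} + \cdots\Bigr\}\bigl\{1 + (\alpha_1\beta n^{\beta-1} + \alpha_2(\beta - 1/\rho) n^{\beta-1/\rho-1} + \cdots) + \cdots\bigr\} \\
 &+ \frac{3}{4}\Bigl(1 + \frac{21}{2n} + \cdots\Bigr) n^{2\mu_0}\lambda^2 \Bigl\{1 + \frac{2\theta + 2\mu_0}{n} + \cdots\Bigr\}\bigl\{1 + (2\alpha_1\beta n^{\beta-1} + 2\alpha_2(\beta - 1/\rho) n^{\beta-1/\rho-1} + \cdots) + \cdots\bigr\} \\
 &- \frac{1}{8}\Bigl(1 + \frac{18}{n} + \cdots\Bigr) n^{3\mu_0}\lambda^3 \Bigl\{1 + \frac{3\theta + 9\mu_0/2}{n} + \cdots\Bigr\}\bigl\{1 + (3\alpha_1\beta n^{\beta-1} + 3\alpha_2(\beta - 1/\rho) n^{\beta-1/\rho-1} + \cdots) + \cdots\bigr\}.
\end{aligned}$$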



First we consider the maximum power of $n$, which is zero. In view of the above equation we obtain
$\mu_0 = 0$. This implies $\rho = 1$ since $\rho \ge 1$ and $\rho$ should be the smallest integer s.t. $\rho\mu_0 \in \mathbb{N}$. Equating
the constant terms, we obtain that $\lambda$ is indeed a root of the cubic polynomial

$$P(X) = 1 + \frac{15}{8} X + \frac{3}{4} X^2 - \frac{1}{8} X^3.$$

Therefore we have $\lambda = 8$ or $-1$. Notice that $0 \le \beta < 1$ implies $\beta = 0$. Otherwise, equating
the coefficient of $n^{\beta-1}$ implies $\alpha_1 = 0$, which is impossible. It remains to compute $\theta$. For this
purpose we equate the coefficient of $n^{-1}$, i.e.
$\frac{15}{8}\, 8\, (\theta + \frac{27}{5}) + \frac{3}{4}\, 8^2 (\frac{21}{2} + 2\theta) - \frac{1}{8}\, 8^3 (18 + 3\theta) = 0$, from
which we conclude $\theta = -7$. Since $p_{3,2}(n)$ is monotone increasing, $p_{3,2}(n)$ coincides with the only
monotonically increasing FSS, given by

(4.5) $p_{3,2}(n+1) \sim K \cdot 8^n \cdot n^{-7}(1 + c_1/n + c_2/n^2 + c_3/n^3 + \cdots)$

for some $K > 0$ and constants $c_1, c_2, c_3$, and the proof of the claim is complete. We compute
$c_1 = -28$, $c_2 = 455.778$ and $c_3 = -5651.160494$ by equating the coefficients of the subsequent powers of $n$
($2268 + 81c_1 = 0$, $1683c_1 + 162c_2 - 26712 = 0$ and $-32547c_1 + 729c_2 + 129654 + 243c_3 = 0$) and
finally get $K = 6686.408973$ numerically to complete the proof of the theorem. □
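Two ingredients of the proof lend themselves to a numerical sanity check. The Python sketch below is an illustration under our own assumptions: it evaluates the cubic $1 + \frac{15}{8}X + \frac{3}{4}X^2 - \frac{1}{8}X^3$ obtained from the leading coefficients of recursion (3.3) at $X = 8$ and $X = -1$, and it estimates the exponent $\theta$ from the ratio of consecutive terms, using $p_{3,2}(n+2)/p_{3,2}(n+1) \approx 8(1 + \theta/n)$ for large $n$. The function names and the choice $n = 2000$ are ours.

```python
from fractions import Fraction


def char_poly(x):
    """Cubic from the leading coefficients of recursion (3.3), normalised by 8."""
    x = Fraction(x)
    return 1 + Fraction(15, 8) * x + Fraction(3, 4) * x ** 2 - Fraction(1, 8) * x ** 3


def p32_values(count):
    """p_{3,2}(1), ..., p_{3,2}(count) from recursion (3.3), exact integers."""
    p = [1, 1, 2]                                   # p[m - 1] = p_{3,2}(m)
    n = 0
    while len(p) < count:
        num = (8 * (n + 2) * (n + 3) * (n + 1) * p[n]
               + 3 * (n + 2) * (5 * n * n + 47 * n + 104) * p[n + 1]
               + 3 * (n + 4) * (2 * n + 11) * (n + 7) * p[n + 2])
        p.append(num // ((n + 9) * (n + 8) * (n + 7)))
        n += 1
    return p[:count]


def theta_estimate(n, p):
    """n * (ratio / 8 - 1) with ratio = p_{3,2}(n+2) / p_{3,2}(n+1)."""
    ratio = Fraction(p[n + 1], p[n])
    return float(n * (ratio / 8 - 1))


values = p32_values(2002)
print(char_poly(8), char_poly(-1))
print(theta_estimate(2000, values))
```

The estimate approaches $-7$ as $n$ grows; the remaining deviation is of order $1/n$.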
5. Appendix



5.1. The Birkhoff-Trjitzinsky theory. Any difference equation with rational coefficients
can be written as

(5.1) $\sum_{h=0}^{m} c_h(n)\, y(n+h) = 0, \qquad c_0(n) = 1,\ c_m(n) \neq 0,\ n = 0, 1, 2, \ldots$

where the coefficients possess representations as generalized Poincaré series

(5.2) $c_h(n) \sim n^{K_h/\omega} \bigl[ c_{0,h} + c_{1,h}\, n^{-1/\omega} + c_{2,h}\, n^{-2/\omega} + \cdots \bigr], \qquad h = 1, 2, \ldots, m.$

Here $K_h$ is an integer, $\omega$ is an integer $\ge 1$ independent of $h$ and $c_{0,h} \neq 0$ unless $c_h(n) \equiv 0$. We
shall assume that $\omega$ is minimal. A set of functions $z^{(j)}(n)$, $j = 1, \ldots, m$, is called linearly independent if the
determinant satisfies

(5.3) $\forall\, n \in \mathbb{N} \cup \{0\}\colon \quad \det\bigl(z^{(j+1)}(n+i)\bigr)_{0 \le i, j \le m-1} \neq 0.$



The classical theory of difference equations asserts that eq. (|5.ip possesses a set of linearly inde- 
pendent solutions constituting a basis of the solution space. Such a set is called a fundamental set. 
The Birkhoff-Trjitzinsky theory proves that there exists a fundamental set in which all elements 
have an asymptotic expansion consisting of an exponential leading term multiplied by a linear 
combination of descending scries of the form cq. (j5.2p . To provide the notion of formal series 
solution and Birkhoff series we set 



p t 
p+i-j 



i=i 3=0 



where p, r^, poP a-rc integers, p > 1, p,j, 0, bgj G C, 6o.j 7^ 0, unless bsj = for s = 0, 1, 2, . . . , j'q = 0, 
— TT < Im(/ii) < TT. Then we call 

$y(\rho, n) = e^{Q(\rho, n)}\, s(\rho, n)$

a formal series solution (FSS) of eq. (5.1) if and only if, after substituting it into eq. (5.1), dividing by
$e^{Q(\rho, n)}$ and performing the corresponding algebraic transformations, the coefficients of

$n^{\theta + r/\rho} (\ln n)^{s}, \qquad s = 0, 1, \dots, t, \quad r = 0, \pm 1, \pm 2, \dots,$

are equal to zero. For a given sequence $(f(n))_{n \geq 0}$ we furthermore call

(5.4) $f(n) \sim e^{Q(\rho, n)}\, s(\rho, n)$

the Birkhoff series for $f(n)$ if and only if for every $k \geq 1$ there exist bounded functions $A_{kj}(n)$,
$j = 0, 1, \dots, t$, such that

(5.5) $e^{-Q(\rho, n)}\, n^{-\theta} f(n) = \sum_{j=0}^{t} (\ln n)^{j}\, n^{r_{t-j}/\rho} \sum_{s=0}^{k-1} b_{sj}\, n^{-s/\rho} + n^{-k/\rho} \sum_{j=0}^{t} (\ln n)^{j} A_{kj}(n).$

Following [33] we define 

(5.6) $w_k = \det\left(e^{Q_{j+1}(\rho, n+i)}\, s_{j+1}(\rho, n+i)\right)_{0 \le i,j \le k-1}.$



The main result of the Birkhoff-Trjitzinsky theory can now be stated as follows.

Theorem 6. [SUHj There exist exactly $m$ FSS of eq. (5.1) of type $e^{Q(\rho, n)} s(\rho, n)$, where $\rho = \nu\omega$ for
some integer $\nu \geq 1$, and each FSS represents asymptotically some solution of the equation. The
above FSS are, up to multiplicative constants, unique, and the $m$ solutions so represented constitute
a fundamental set for the equation.
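As an illustration, the expansion (4.5) can be read as an FSS of the simplest kind. The sketch below assumes the usual Birkhoff template $y(\rho,n) = e^{Q(\rho,n)} s(\rho,n)$ with $Q(\rho,n) = \mu_0 n \ln n + \sum_{j=1}^{\rho} \mu_j n^{(\rho+1-j)/\rho}$ and a power-logarithmic factor $s(\rho,n)$. The parameter identification is our own reading of (4.5), not a statement taken from the proof.

```latex
% Assumed special case of the FSS template: \rho = 1, t = 0, \mu_0 = 0, r_0 = 0, so that
%   Q(1,n) = \mu_1 n   and   s(1,n) = n^{\theta} \sum_{s \ge 0} b_{s0}\, n^{-s}.
\begin{align*}
K\, 8^{n}\, n^{-7}\Bigl(1 + \frac{c_1}{n} + \frac{c_2}{n^{2}} + \frac{c_3}{n^{3}} + \cdots\Bigr)
  &= e^{\mu_1 n}\; n^{\theta} \sum_{s \ge 0} b_{s0}\, n^{-s}, \\
\mu_1 = \ln 8, \quad \theta = -7, \quad
b_{00} &= K, \quad b_{s0} = K c_s \quad (s = 1, 2, 3).
\end{align*}
```

Since an FSS is unique only up to a multiplicative constant, substituting it into the recurrence determines $c_1, c_2, c_3$ but not $K$, which is consistent with $K$ being obtained numerically.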



Acknowledgments. We are grateful to Emma Y. Jin for helpful discussions. This work was 
supported by the 973 Project, the PCSIRT Project of the Ministry of Education, the Ministry of 
Science and Technology, and the National Science Foundation of China. 



References 



1. Mapping RNA form and function. Science, 2, 2005. 

2. D. André, Solution directe du problème résolu par M. Bertrand. C. R. Acad. Sci. Paris 105 (1887), 436-437.

3. Alexander S. Brodsky, Heidi A. Erlacher and James R. Williamson, NMR evidence for a base triple in the
HIV-2 TAR C-G • C+ mutant-argininamide complex. Nucleic Acids Research, 26 (1998), No. 8, 1991-1995.

4. Mireille Bousquet-Mélou and Guoce Xin, On partitions avoiding 3-crossings. Séminaire Lotharingien de
Combinatoire, 54 (2006), Article B54e.

5. George D. Birkhoff, Formal theory of irregular difference equations. Acta Math., 54 (1930), 205-246.

6. George D. Birkhoff and W. J. Trjitzinsky, Analytic theory of singular difference equations. Acta Math., 60
(1932), 1-89.

7. William Y. C. Chen, Eva Y.P. Deng, Rosena R.X. Du, Richard P. Stanley and Catherine H. Yan, Crossings
and Nestings of Matchings and Partitions. Trans. Amer. Math. Soc. 359 (2007), No. 4, 1555-1575.

8. William Y. C. Chen, Jing Qin and Christian M. Reidys, Crossings and Nestings of tangled-diagrams. Submitted.

9. Michael Chastain and Ignacio Tinoco, Jr., A Base-triple Structural Domain in RNA. Biochemistry, 31 (1992),
12733-12741.






10. Robert T. Batey, Robert P. Rambo, and Jennifer A. Doudna, Tertiary Motifs in RNA Structure and Folding.
Angew. Chem. Int. Ed., 38 (1999), 2326-2343.

11. C. Banderier, M. Bousquet-Mélou, A. Denise, P. Flajolet, D. Gardy, and D. Gouyou-Beauchamps, Generating
functions of generating trees, Discrete Mathematics, 246 (2002), no. 1-3, 29-55.

12. A. Berele, A Schensted-type correspondence for the symplectic group, J. Combinatorial Theory (A), 43 (1986), 
320-328. 

13. G. Fayolle, R. Iasnogorodski, and V. Malyshev, Random walks in the quarter-plane: Algebraic methods,
boundary value problems and applications, volume 40 of Applications of Mathematics. Springer-Verlag, Berlin, 1999.

14. I. M. Gessel and D. Zeilberger, Random walk in a Weyl chamber, Proc. Amer. Math. Soc. 115 (1992), 27-31.

15. D. Gouyou-Beauchamps, Standard Young tableaux of height 4 and 5, Europ. J. Combin., 10 (1989), 69-82.

16. D. Grabiner and P. Magyar, Random walks in Weyl chambers and the decomposition of tensor powers, J. Alg. 
Combinatorics, 2 (1993), 239-260. 

17. C. Greene, An extension of Schensted's theorem, Adv. Math., 14 (1974), 254-265. 

18. C. Haslinger and P.F. Stadler, RNA Structures with Pseudo-Knots. Bull. Math. Biol., 61 (1999), 437-467.

19. E.Y. Jin, J. Qin, and C.M. Reidys, Combinatorics of RNA structures with pseudoknots, Bull. Math. Biol., 2007.
in press.

20. E.Y. Jin, J. Qin, and C.M. Reidys, On k-noncrossing partitions, submitted.

21. E.Y. Jin and C.M. Reidys, Asymptotic enumeration of RNA structures with Pseudoknots, Bull. Math. Biol.,
2007. in press.

22. D.E. Knuth, The art of computer programming, vol. 1: Fundamental Algorithms, Addison-Wesley, 1973, Third
edition, 1997.

23. D.A.M. Konings and R.R. Gutell, A comparison of thermodynamic foldings with comparatively derived
structures of 16S and 16S-like rRNAs, RNA, 1 (1995), 559-574.

24. A. Loria and T. Pan, Domain structure of the ribozyme from eubacterial ribonuclease P, RNA, 2 (1996), 551-563.

25. S. G. Mohanty, Lattice Path Counting and Applications, Academic Press, New York, 1979. 

26. E. Westhof and L. Jaeger, RNA pseudoknots. Current Opinion Struct. Biol., 2 (1992), 327-333.

27. M. Petkovšek, H.S. Wilf and D. Zeilberger, A=B. A K Peters Ltd., Wellesley, MA, 1996.

28. C. E. Schensted, Longest increasing and decreasing subsequences, Canad. J. Math., 13 (1961), 179-191.

29. L. X. Shen, Zhuoping Cai and Ignacio Tinoco, Jr., RNA structure at high resolution, FASEB J., 9 (1995),
1023-1033.

30. Sona Sivakova and Stuart J. Rowan, Nucleobases as supramolecular motifs. Chem. Soc. Rev., 34 (2005), 9-21.

31. R. Stanley, Enumerative Combinatorics, vol. 1, Wadsworth and Brooks/Cole, Pacific Grove, CA, 1986; second 
printing, Cambridge University Press, Cambridge, 1996. 

32. R. Stanley, Enumerative Combinatorics, vol. 2, Cambridge University Press, Cambridge, 1999. 

33. Jet Wimp and Doron Zeilberger, Resurrecting the asymptotics of linear recurrences. Journal of Mathematical
Analysis and Applications, 111 (1985), 162-176.

34. M. Zuker and P. Stiegler, Optimal computer folding of large RNA sequences using thermodynamics and auxiliary
information, Nucl. Acids Res., 9 (1981), 133-148.

35. M. Zuker and D. Sankoff, RNA Secondary Structures and their prediction. Bull. Math. Biol., 46 (1984), 591-621.






Center for Combinatorics, LPMC-TJKLC, Nankai University, Tianjin 300071, P.R. China,
Phone: *86-22-2350-6800, Fax: *86-22-2350-9272

E-mail address: reidys@nankai.edu.cn






